LoGAN: Generating Logos with a Generative Adversarial Neural Network Conditioned on color
Designing a logo is a long, complicated, and expensive process for any
designer. However, recent advancements in generative algorithms provide models
that could offer a possible solution. Logos are multi-modal, have very few
categorical properties, and do not have a continuous latent space. Yet,
conditional generative adversarial networks can be used to generate logos that
could help designers in their creative process. We propose LoGAN: an improved
auxiliary classifier Wasserstein generative adversarial neural network (with
gradient penalty) that is able to generate logos conditioned on twelve
different colors. In 768 generated instances (12 classes and 64 logos per
class), when looking at the most prominent color, the conditional generation
part of the model has an overall precision and recall of 0.8 and 0.7
respectively. LoGAN's results offer a first glance at how artificial
intelligence can be used to assist designers in their creative process and open
promising future directions, such as including more descriptive labels which
will provide a more exhaustive and easy-to-use system.
Comment: 6 pages, ICMLA1
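The color conditioning described above can be sketched minimally: a class label is appended to the latent noise before it enters the conditional generator. The 12 color classes come from the abstract; the 100-dimensional noise vector and the one-hot encoding are assumptions for illustration, not details from the paper.

```python
import numpy as np

N_CLASSES = 12   # twelve color labels (from the abstract)
NOISE_DIM = 100  # latent noise dimensionality (assumed)

def generator_input(color_id, rng=np.random.default_rng(0)):
    """Concatenate Gaussian noise with a one-hot color label."""
    z = rng.standard_normal(NOISE_DIM)
    label = np.zeros(N_CLASSES)
    label[color_id] = 1.0
    return np.concatenate([z, label])  # fed to the generator network

x = generator_input(color_id=3)
print(x.shape)  # (112,)
```

The discriminator side of an auxiliary classifier GAN would additionally predict the color label back from the generated image, which is what makes the conditioning learnable.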
A retrieval-based dialogue system utilizing utterance and context embeddings
Finding semantically rich and computer-understandable representations for
textual dialogues, utterances and words is crucial for dialogue systems (or
conversational agents), as their performance mostly depends on understanding
the context of conversations. Recent research aims at finding distributed
vector representations (embeddings) for words, such that semantically similar
words are relatively close within the vector-space. Encoding the "meaning" of
text into vectors is a current trend, and text can range from words, phrases
and documents to actual human-to-human conversations. In recent research
approaches, responses have been generated utilizing a decoder architecture,
given the vector representation of the current conversation. In this paper, the
utilization of embeddings for answer retrieval is explored by using
Locality-Sensitive Hashing Forest (LSH Forest), an Approximate Nearest Neighbor
(ANN) model, to find similar conversations in a corpus and rank possible
candidates. Experimental results on the well-known Ubuntu Corpus (in English)
and a customer service chat dataset (in Dutch) show that, in combination with a
candidate selection method, retrieval-based approaches outperform generative
ones and reveal promising future research directions towards the usability of
such a system.
Comment: A shorter version is accepted at the ICMLA2017 conference; acknowledgement added; typos corrected
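The retrieval step above hinges on approximate nearest neighbor search over embeddings. A minimal sketch of random-hyperplane LSH, the hashing family underlying LSH Forest; the dimensions and data here are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM, N_BITS = 64, 16
planes = rng.standard_normal((N_BITS, DIM))  # one random hyperplane per hash bit

def lsh_key(vec):
    """Nearby vectors (small angle) tend to fall on the same side of
    each hyperplane, so they share a bucket key."""
    return tuple((planes @ vec > 0).astype(int))

# index a toy corpus of conversation embeddings into hash buckets
corpus = rng.standard_normal((100, DIM))
buckets = {}
for i, vec in enumerate(corpus):
    buckets.setdefault(lsh_key(vec), []).append(i)

# candidates for a query share its bucket; rank them afterwards as needed
candidates = buckets.get(lsh_key(corpus[7]), [])
print(7 in candidates)  # True
```

An LSH Forest refines this idea with variable-length keys and multiple trees, trading memory for recall; the bucketing principle is the same.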
Massive Open Online Courses Temporal Profiling for Dropout Prediction
Massive Open Online Courses (MOOCs) are attracting the attention of people
all over the world. Regardless of the platform, the numbers of registrants for
online courses are impressive but, at the same time, completion rates are
disappointing. Understanding the mechanisms of dropping out based on the
learner profile arises as a crucial task in MOOCs, since it will allow
intervening at the right moment in order to assist the learner in completing
the course. In this paper, the dropout behaviour of learners in a MOOC is
thoroughly studied by first extracting features that describe the behavior of
learners within the course and then by comparing three classifiers (Logistic
Regression, Random Forest and AdaBoost) in two tasks: predicting which users
will have dropped out by a certain week and predicting which users will drop
out on a specific week. The former proved to be considerably easier, with
all three classifiers performing equally well. However, the accuracy for the
second task is lower, and Logistic Regression tends to perform slightly better
than the other two algorithms. We found that features that reflect an active
attitude of the user towards the MOOC, such as submitting their assignment,
posting on the Forum and filling their Profile, are strong indicators of
persistence.
Comment: 8 pages, ICTAI1
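The three-classifier comparison described above can be sketched with scikit-learn on synthetic stand-in features; in the real setup, the columns would be behavioral counts such as assignment submissions, forum posts, and profile completion. The data and settings below are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# synthetic stand-in for per-learner behavioral features and dropout labels
X, y = make_classification(n_samples=500, n_features=8, random_state=0)

for clf in (LogisticRegression(max_iter=1000),
            RandomForestClassifier(random_state=0),
            AdaBoostClassifier(random_state=0)):
    acc = cross_val_score(clf, X, y, cv=5).mean()  # 5-fold CV accuracy
    print(f"{type(clf).__name__}: {acc:.3f}")
```

Cross-validated accuracy is a reasonable first metric for the "dropped out by week t" task; the harder "drops out on week t" task would warrant per-week models or a multi-class target.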
Accumulated Gradient Normalization
This work addresses the instability in asynchronous data parallel
optimization. It does so by introducing a novel distributed optimizer which is
able to efficiently optimize a centralized model under communication
constraints. The optimizer achieves this by pushing a normalized sequence of
first-order gradients to a parameter server. This implies that the magnitude of
a worker delta is smaller compared to an accumulated gradient, and provides a
better direction towards a minimum compared to first-order gradients, which in
turn also forces possible implicit momentum fluctuations to be more aligned
since we make the assumption that all workers contribute towards a single
minimum. As a result, our approach mitigates the parameter staleness problem
more effectively since staleness in asynchrony induces (implicit) momentum, and
achieves a better convergence rate compared to other optimizers such as
asynchronous EASGD and DynSGD, which we show empirically.
Comment: 16 pages, 12 figures, ACML201
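The accumulate-then-normalize idea can be sketched on a one-dimensional quadratic loss: each worker runs several local first-order steps, sums the gradients it saw, and pushes the sum divided by the number of steps to the parameter server. The local step count, learning rates, and the sequential stand-in for asynchronous workers are all illustrative assumptions.

```python
def grad(w):
    """Gradient of the toy loss f(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)

def worker_delta(w_server, n_local=8, lr=0.1):
    """Run n_local SGD steps from the server copy, accumulate the
    first-order gradients, and return the accumulated gradient
    normalized by the number of local steps; the resulting delta is
    smaller in magnitude and better directed than a single gradient."""
    w, acc = w_server, 0.0
    for _ in range(n_local):
        g = grad(w)
        acc += g
        w -= lr * g
    return acc / n_local

w = 0.0  # central (parameter-server) model
for _ in range(50):       # sequential stand-in for asynchronous workers
    w -= 0.1 * worker_delta(w)
print(round(w, 2))  # close to the optimum at 3
```

In the asynchronous setting, the smaller normalized deltas are what keep stale updates from overshooting, since each commit moves the central model less than a raw accumulated gradient would.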
Adapting End-to-End Speech Recognition for Readable Subtitles
Automatic speech recognition (ASR) systems are primarily evaluated on
transcription accuracy. However, in some use cases such as subtitling, verbatim
transcription would reduce output readability given limited screen size and
reading time. Therefore, this work focuses on ASR with output compression, a
task challenging for supervised approaches due to the scarcity of training
data. We first investigate a cascaded system, where an unsupervised compression
model is used to post-edit the transcribed speech. We then compare several
methods of end-to-end speech recognition under output length constraints. The
experiments show that, with far less data than needed to train a model from
scratch, we can adapt a Transformer-based ASR model to incorporate
both transcription and compression capabilities. Furthermore, the best
performance in terms of WER and ROUGE scores is achieved by explicitly modeling
the length constraints within the end-to-end ASR system.
Comment: IWSLT 202
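The cascaded setup's post-editing step can be caricatured as an unsupervised compressor that drops low-content tokens until the transcript fits a subtitle budget. The stopword heuristic below stands in for the paper's actual compression model and is purely an assumption.

```python
# filler tokens assumed safe to drop from a verbatim transcript
STOPWORDS = {"uh", "um", "you", "know", "i", "mean", "like", "so", "well"}

def compress(transcript, max_chars):
    """Drop filler words, then truncate from the end until the
    transcript fits the character budget for one subtitle line."""
    words = [w for w in transcript.split() if w.lower() not in STOPWORDS]
    while words and len(" ".join(words)) > max_chars:
        words.pop()  # crude fallback: truncate trailing words
    return " ".join(words)

out = compress("uh so i mean the model you know adapts to partial input", 30)
print(out)  # the model adapts to partial
```

A learned end-to-end model, as the abstract argues, can do better than such a cascade because it decides jointly what to transcribe and what to omit under the length constraint.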
Low-Latency Sequence-to-Sequence Speech Recognition and Translation by Partial Hypothesis Selection
Encoder-decoder models provide a generic architecture for
sequence-to-sequence tasks such as speech recognition and translation. While
offline systems are often evaluated on quality metrics like word error rates
(WER) and BLEU, latency is also a crucial factor in many practical use-cases.
We propose three latency reduction techniques for chunk-based incremental
inference and evaluate their efficiency in terms of accuracy-latency trade-off.
On the 300-hour How2 dataset, we reduce latency by 83% to 0.8 seconds by
sacrificing 1% WER (6% rel.) compared to offline transcription. Although our
experiments use the Transformer, the hypothesis selection strategies are
applicable to other encoder-decoder models. To avoid expensive re-computation,
we use a unidirectionally-attending encoder. After an adaptation procedure to
partial sequences, the unidirectional model performs on-par with the original
model. We further show that our approach is also applicable to low-latency
speech translation. On How2 English-Portuguese speech translation, we reduce
latency to 0.7 seconds (-84% rel.) while incurring a loss of 2.4 BLEU points
(5% rel.) compared to the offline system.
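One way to realize chunk-based partial hypothesis selection is to commit only the prefix on which all current beam hypotheses agree, so already-displayed output never needs to be retracted. A sketch of that idea; the beam contents are illustrative, and the paper's actual selection strategies may differ.

```python
def stable_prefix(hypotheses):
    """Longest common token prefix of all partial beam hypotheses."""
    prefix = []
    for tokens in zip(*[h.split() for h in hypotheses]):
        if len(set(tokens)) > 1:  # hypotheses disagree from here on
            break
        prefix.append(tokens[0])
    return " ".join(prefix)

beam = ["we reduce latency by eighty",
        "we reduce latency by a",
        "we reduce latency to"]
print(stable_prefix(beam))  # we reduce latency
```

Committing agreed-upon prefixes per chunk trades a little accuracy for latency: the earlier the beam converges, the sooner tokens can be shown, which is the accuracy-latency trade-off the abstract evaluates.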
Towards Controlled Transformation of Sentiment in Sentences
An obstacle to the development of many natural language processing products
is the vast amount of training examples necessary to get satisfactory results.
The generation of these examples is often a tedious and time-consuming task.
This paper proposes a method to transform the sentiment of sentences
in order to limit the work necessary to generate more training data. This means
that one sentence can be transformed to an opposite sentiment sentence and
should reduce by half the work required in the generation of text. The proposed
pipeline consists of a sentiment classifier with an attention mechanism to
highlight the short phrases that determine the sentiment of a sentence. Then,
these phrases are changed to phrases of the opposite sentiment using a baseline
model and an autoencoder approach. Experiments are run both on the separate
parts of the pipeline and on the end-to-end model. The sentiment classifier
is tested on its accuracy and is found to perform adequately. The autoencoder
is tested on how well it is able to change the sentiment of an encoded phrase,
and this task is found to be feasible. We use human evaluation to judge the
performance of the full (end-to-end) pipeline, which reveals that a model
using word vectors outperforms the encoder model.
Numerical evaluation shows that a success rate of 54.7% is achieved on the
sentiment change.
Comment: Accepted at ICAART 2019, 8 pages
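The baseline replacement step can be caricatured with a small antonym lexicon: once the attention mechanism has located the sentiment-bearing words, each is swapped for an opposite-sentiment counterpart. The lexicon here is a toy assumption; the paper's baseline and autoencoder are learned models.

```python
# toy opposite-sentiment lexicon (assumed, not from the paper)
ANTONYMS = {"great": "terrible", "terrible": "great",
            "love": "hate", "hate": "love",
            "best": "worst", "worst": "best"}

def flip_sentiment(sentence):
    """Replace each lexicon word with its opposite-sentiment counterpart;
    all other words pass through unchanged."""
    return " ".join(ANTONYMS.get(w, w) for w in sentence.split())

print(flip_sentiment("i love this great movie"))  # i hate this terrible movie
```

Because the mapping is symmetric, flipping twice recovers the original sentence, which mirrors the abstract's point that one labeled sentence yields a training example of each polarity.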
Exploring the context of recurrent neural network based conversational agents
Conversational agents are on the rise in both the academic world (in terms
of research) and the commercial world (in terms of applications). This paper
investigates the task of building a non-goal driven conversational agent, using
neural network generative models and analyzes how the conversation context is
handled. It compares a simpler Encoder-Decoder with a Hierarchical Recurrent
Encoder-Decoder architecture, which includes an additional module to model the
context of the conversation using information from previous utterances. We
found that the hierarchical model was able to extract relevant context
information and include it in the generation of the output. However, it
performed worse (by 35-40%) than the simple Encoder-Decoder model in terms of
both grammatically correct output and meaningful responses. Despite these
results, experiments
demonstrate how conversations about similar topics appear close to each other
in the context space due to the increased frequency of specific topic-related
words, thus leaving promising directions for future research on how the
context of a conversation can be exploited.
Comment: Accepted at ICAART 2019, 10 pages
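The context-space observation above can be illustrated with toy vectors: if same-topic utterances share topic-related words, conversation representations built by averaging word vectors land close together. The word vectors below (a shared topic direction plus noise) are an assumption for illustration, not the paper's learned HRED context states.

```python
import numpy as np

rng = np.random.default_rng(0)
linux_topic = rng.standard_normal(16)
food_topic = rng.standard_normal(16)

# each word vector = its topic direction plus small word-specific noise
vocab = {w: linux_topic + 0.3 * rng.standard_normal(16)
         for w in ["install", "driver", "kernel"]}
vocab.update({w: food_topic + 0.3 * rng.standard_normal(16)
              for w in ["pasta", "oven", "recipe"]})

def context_vec(conversation):
    """Mean word vector over all utterances in the conversation."""
    words = [vocab[w] for utt in conversation for w in utt.split()]
    return np.mean(words, axis=0)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

linux_a = context_vec(["install driver", "kernel driver"])
linux_b = context_vec(["install kernel", "driver kernel"])
cooking = context_vec(["pasta recipe", "oven recipe"])
print(cosine(linux_a, linux_b) > cosine(linux_a, cooking))
```

A hierarchical encoder replaces the crude averaging with a learned utterance-level recurrence, but the clustering effect driven by topic-word frequency is the same one the abstract reports.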